Audio scene segmentation using multiple features, models and time scales

Authors

Hari Sundaram

Shih-Fu Chang

Abstract

In this paper we present an algorithm for audio scene segmentation. An audio scene is a semantically consistent sound segment characterized by a few dominant sources of sound. A scene change occurs when a majority of the sources present in the data change. Our segmentation framework has three parts: (a) a definition of an audio scene, (b) multiple feature models that characterize the dominant sources, and (c) a simple, causal listener model that mimics human audition using multiple time scales. We define a correlation function that measures how strongly the current data correlates with past data, and we use it to locate segmentation boundaries. The algorithm was tested on a difficult data set, a 1-hour audio segment of a film, where it achieved an audio scene change detection accuracy of 97%.
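The idea of a causal listener that compares recent audio against remembered past audio can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the features, the correlation function, or the decision rule, so the sketch assumes per-frame feature vectors, substitutes plain cosine similarity for the paper's correlation function, and flags local minima of that similarity as candidate boundaries. The names `attention`, `memory`, and `threshold` are hypothetical parameters chosen for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def mean_vector(frames):
    """Component-wise mean of a non-empty list of feature vectors."""
    n = len(frames)
    return [sum(col) / n for col in zip(*frames)]


def memory_similarity(frames, t, attention, memory):
    """Similarity between the newest `attention` frames ending at t and the
    older frames still held in a `memory`-frame buffer. Uses only frames <= t,
    so the comparison is causal."""
    recent = frames[t - attention + 1 : t + 1]
    past = frames[t + 1 - memory : t - attention + 1]
    return cosine(mean_vector(recent), mean_vector(past))


def candidate_boundaries(frames, attention=2, memory=6, threshold=0.5):
    """Return frame indices where similarity to the past hits a local minimum
    below `threshold` (assumed decision rule, not taken from the paper)."""
    if not 0 < attention < memory:
        raise ValueError("need 0 < attention < memory")
    ts = list(range(memory - 1, len(frames)))
    curve = [memory_similarity(frames, t, attention, memory) for t in ts]
    hits = []
    for i in range(1, len(curve) - 1):
        is_min = curve[i] < curve[i - 1] and curve[i] <= curve[i + 1]
        if is_min and curve[i] < threshold:
            # Report the first frame of the attention window that differed.
            hits.append(ts[i] - attention + 1)
    return hits


# Toy input: 8 frames dominated by one source, then 8 by another.
toy = [[1.0, 0.0]] * 8 + [[0.0, 1.0]] * 8
print(candidate_boundaries(toy))  # [8]
```

On the toy input the similarity stays at 1.0 while the source is stable, drops to 0 once the attention window holds only the new source, and recovers as the memory fills with new data, so a single boundary is reported at frame 8. A multi-feature, multi-time-scale version would presumably run several such comparisons and combine their votes, but how the paper does that is not stated in the abstract.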

Similar resources

Video Scene Segmentation using Video and Audio Features

In this paper we present a novel algorithm for video scene segmentation. We model a scene as a semantically consistent chunk of audio-visual data. Central to the segmentation framework is the idea of a finite-memory model. We separately segment the audio and video data into scenes, using data in the memory. The audio segmentation algorithm determines the correlations amongst the envelopes of au...

Video Scene Segmentation System Using Audio-visual Features

This work demonstrates a new approach to video temporal segmentation into scenes. The utilized technique is based on an audio-visual extension of the well-known method of the Scene Transition Graph (STG). This multi-modal extension exploits both low- and high-level audio-visual descriptors to construct distinct STGs. These STGs are employed into a probabilistic framework that is used for estimati...

Traffic Scene Analysis using Hierarchical Sparse Topical Coding

Analyzing motion patterns in traffic videos can be exploited directly to generate high-level descriptions of the video contents. Such descriptions may further be employed in different traffic applications such as traffic phase detection and abnormal event detection. One of the most recent and successful unsupervised methods for complex traffic scene analysis is based on topic models. In this pa...

A Hybrid Algorithm based on Deep Learning and Restricted Boltzmann Machine for Car Semantic Segmentation from Unmanned Aerial Vehicles (UAVs)-based Thermal Infrared Images

Nowadays, ground vehicle monitoring (GVM) is one of the areas of application in the intelligent traffic control system using image processing methods. In this context, the use of unmanned aerial vehicles based on thermal infrared (UAV-TIR) images is one of the optimal options for GVM due to the suitable spatial resolution, cost-effectiveness and low volume of images. The methods that have been prop...

Structural and Semantic Analysis of Video

In this paper we discuss our recent research and open issues in structural and semantic analysis of digital videos. Specifically, we focus on segmentation, summarization and classification of digital video. In each area, we also emphasize the importance of understanding domain-specific characteristics. In scene segmentation, we introduce the idea of a computable scene as a chunk of audio-visual...

Publication date: 2000
